Experimental nonclassicality in a causal network without assuming freedom of choice

In a Bell experiment, it is natural to seek a causal account of correlations wherein only a common cause acts on the outcomes. For this causal structure, Bell inequality violations can be explained only if causal dependencies are modeled as intrinsically quantum. There also exists a vast landscape of causal structures beyond Bell that can witness nonclassicality, in some cases without even requiring free external inputs. Here, we undertake a photonic experiment realizing one such example: the triangle causal network, consisting of three measurement stations pairwise connected by common causes and no external inputs. To demonstrate the nonclassicality of the data, we adapt and improve three known techniques: (i) a machine-learning-based heuristic test, (ii) a data-seeded inflation technique generating polynomial Bell-type inequalities and (iii) entropic inequalities. The demonstrated experimental and data analysis tools are broadly applicable, paving the way for future networks of growing complexity.

Reviewer #1 (Remarks to the Author): In the manuscript entitled "Experimental nonclassicality in the triangle causal network", the authors report an experimental violation of local causality in the triangle scenario. In particular, the authors experimentally realize a tripartite distribution and demonstrate that it contradicts the predictions of any local-causal model. To this end, the authors use an experimental setup with three independent sources of entangled photons, and apply two different data-analysis methods. The experimental setup is similar to that used in their previous work (Ref. [43] of the manuscript). The two data-analysis methods are the machine-learning-based test (developed in Ref. [31]) and the inflation technique (developed in Refs. [12,53]). My specific comments are as follows.
First, regarding the experimental realization of the tripartite distribution first proposed by Fritz: the Fritz distribution reveals a new form of nonclassicality in causal networks, that is, nonclassicality without external inputs. However, in the experimental realization reported in this work, there are external inputs that determine the local measurements on the entangled photons shared between Alice and Bob. To simulate the Fritz distribution, the authors post-select the events where the external inputs coincide with the measurement outcomes at the same stations but on the photons from different sources. In my opinion, this is not a faithful realization of the Fritz distribution. Moreover, if there are external inputs for Alice and Bob to determine their local measurements, it is not necessary to involve a third party, Charlie, to witness nonclassicality, as Alice's and Bob's local measurements with external inputs realize a bipartite distribution which maximally violates the CHSH Bell inequality.
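For reference, the maximal CHSH violation mentioned in the comment above can be reproduced numerically. The following is a minimal sketch, assuming ideal projective measurements in a plane on a perfect singlet state, for which the correlator is E(a, b) = -cos(a - b). The function names and the choice of angles are illustrative assumptions and are not taken from the manuscript.

```python
import math

def singlet_correlation(theta_a, theta_b):
    """Correlator E(a, b) = -cos(a - b) for an ideal singlet state
    measured along directions at angles theta_a and theta_b (radians)."""
    return -math.cos(theta_a - theta_b)

def chsh_value(a0, a1, b0, b1):
    """CHSH combination S = E(a0,b0) + E(a0,b1) + E(a1,b0) - E(a1,b1)."""
    E = singlet_correlation
    return E(a0, b0) + E(a0, b1) + E(a1, b0) - E(a1, b1)

# Textbook optimal settings (assumed here, not the experimental ones).
a0, a1 = 0.0, math.pi / 2
b0, b1 = math.pi / 4, -math.pi / 4

s = chsh_value(a0, a1, b0, b1)
classical_bound = 2.0
tsirelson_bound = 2.0 * math.sqrt(2.0)

print(f"|S| = {abs(s):.6f}")
print(f"classical bound = {classical_bound}, Tsirelson bound = {tsirelson_bound:.6f}")
```

With these settings |S| reaches the Tsirelson bound 2√2, which exceeds the local-causal bound of 2. Any noise in a real experiment lowers the value.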
Second, since the data-analysis techniques used in this work were well developed in previous works (Refs. [12,43,53] of the manuscript), the statement in the abstract "To demonstrate the nonclassicality of our data, we introduce two new techniques ..." is confusing. If I am wrong, please specify in which aspects there are new contributions to these techniques. In addition, if I understand correctly, the machine-learning-based technique may not find the optimal local-causal model with respect to an experimentally observed distribution, while the inflation technique guarantees the reliability of the nonclassicality witnessed by experimental data. It would be helpful to clarify these points in the manuscript.
Third, I also have several minor comments: i) Can the authors specify the physical distances between any two of Alice, Bob, and Charlie in Figure 2 (b)? Also, where are the three independent sources for entanglement generation located? ii) What is the "single higher pick" as mentioned in the caption of Figure 3? iii) Can the authors justify the statement "When more than three two-fold coincidences are found in the same time window, the additional events are discarded" in Supplementary Note 2?
If the authors can fix the issue detailed in my first comment and clarify their new contributions in data analysis, if there are any, the work could be appropriate for Nature Communications. Otherwise, I recommend the authors submit this work to a less prestigious journal.
Reviewer #2 (Remarks to the Author): In this manuscript the authors present an experiment demonstrating nonclassical correlations in a triangle network. The triangle network is a three-party network wherein one assumes causal connections between the three parties such that they are connected by three independent sources in the shape of a triangle. They experimentally achieve this with three independent entangled-pair sources where each source connects two parties. The measurements made within each party are simple separable measurements. It was shown in reference [5] that this situation can lead to non-classical correlations. This work provides tools to analyze such situations, which go beyond the now standard causal network in a Bell scenario. The work shows how one can use a neural network to model classical correlations in such networks, and derive so-called "causal compatibility inequalities". I find the work to be relevant and interesting, and I support its publication in Nature Communications. I do have a few points I think the authors should nevertheless address.
1) I'm a little confused about the source independence. The authors state "To justify the assumption of source independence, it is essential to use non-synchronized lasers to pump the generation crystals that act as sources." They then go on to say that experiments which did not do this require additional device-dependent justifications to claim source independence. However, it seems to me that the knowledge that different lasers are used is also device-dependent knowledge. Furthermore, even if I had photons generated using different pump lasers, I could imagine some non-local interaction which later generates correlations between the photons. I think ideally one would devise some set of measurements to verify this condition.
2) I think the authors should include a few more experimental details in the main body, or the methods section. The use of the three uncorrelated sources using both CW and pulsed lasers is quite unusual. I did not see any remarks on the 6-fold rate or the total acquisition time. The only remark related to this is the total counts acquired, which is mentioned in the supplement. These are very important, as the total number of counts acquired is used to calculate their error bars. Furthermore, the 63 μs window is quite large and, again, unusual. The authors describe their choice of the window well in the supplement, but a small remark about its significance should be made in the main body. They could also comment on how this choice affects their 6-fold rate (even in the supplement).
3) At the start of the "Fritz Distribution measurement" section, the authors say "Based on the result of this measurement, one of the two Bell observables … are measured." This confused me, as it seems to imply some sort of feed-forward, and earlier in the paper the authors say "Here, in stark contrast to the Bell scenario, the parties implement a single measurement, rather than implementing one of a set of incompatible measurements." Can the authors clarify this point?
Reviewer #3 (Remarks to the Author): A causal network consists of sources emitting signals, and parties performing measurements on the received signals. The question of characterizing correlations among measurement outcomes of the parties that are compatible with a given causal network is of fundamental interest, particularly because of the phenomenon of network nonlocality.
In the paper under review, the authors consider the triangle network and experimentally demonstrate that this network exhibits nonlocality. The triangle network consists of three parties, any pair of which share a source. Fritz's distribution is a particular distribution on the outcomes of the three parties that is rooted in the CHSH correlation, and is a nonlocal distribution. The authors of this paper perform an experiment to observe Fritz's distribution in the triangle network using sources emitting entangled states. The resulting correlation is of course a noisy version of Fritz's distribution. Then the authors give evidence and proofs to show that even this noisy distribution is nonlocal.
The authors first use machine learning techniques to find optimal response functions in a hypothetical classical scenario that may realize the noisy Fritz's distribution. They conclude that with these methods one cannot get close to the desired distribution. This method, although heuristic, gives evidence that the outcome distribution of the experiment is nonlocal.
The next approach of the authors is via the inflation technique. Inflation is a method for deriving a hierarchy of necessary conditions on local distributions, which, interestingly, are sufficient too. The authors use a second-order inflation which, given the outcome distribution of the experiment, is a linear program. The authors observe that this linear program is infeasible and conclude that the distribution is nonlocal. To prove infeasibility they use Farkas' lemma, and explicitly find an inequality that holds for all local correlations but is violated by the distribution under study. I found this part of the paper particularly interesting.
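The role of Farkas' lemma in the paragraph above can be illustrated on a toy system. The lemma states that exactly one of the following holds: either there is x >= 0 with A x = b, or there is a vector y with A^T y >= 0 (componentwise) and b . y < 0. Such a y is a certificate of infeasibility, and y . b >= 0 is then an inequality satisfied by every feasible b. The sketch below only verifies a given certificate; the matrix, vectors and function name are hypothetical and far smaller than the inflation linear program of the manuscript.

```python
def is_farkas_certificate(A, b, y):
    """Return True if y proves that {x >= 0 : A x = b} is empty.

    Conditions checked: (A^T y)_j >= 0 for every column j, and b . y < 0.
    A is a list of rows; b and y are lists with one entry per row of A.
    """
    n_rows, n_cols = len(A), len(A[0])
    column_products = [
        sum(A[i][j] * y[i] for i in range(n_rows)) for j in range(n_cols)
    ]
    b_dot_y = sum(b[i] * y[i] for i in range(n_rows))
    return all(c >= 0 for c in column_products) and b_dot_y < 0

# Toy infeasible system: x1 + x2 = 1 and x1 + x2 = 2 cannot both hold.
A_toy = [[1, 1],
         [1, 1]]
b_toy = [1, 2]
y_toy = [1, -1]  # A^T y = (0, 0) >= 0 and b . y = -1 < 0

print(is_farkas_certificate(A_toy, b_toy, y_toy))

# For a feasible system (x1 + x2 = 1) this candidate is not a certificate.
print(is_farkas_certificate([[1, 1]], [1], [-1]))
```

In the inflation setting, x would play the role of the joint distribution over the inflated network and b the constraints fixed by the observed data, so the certificate y translates into an inequality violated by the measured distribution.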
I have a major concern about this paper. In the experimental implementation of Fritz's distribution, the authors take all the shared sources to be entangled singlet states. However, in principle two of the sources can be assumed to be classical. While classical sources can be simulated by quantum ones, this would introduce extra noise. By replacing these two sources with classical ones, the associated measurement outcomes would have perfect anti-correlation, resulting in much less noise in the outcome distribution. In fact, it seems to me that the authors used a complicated experimental setup for something that could be done much more easily and with reduced noise. Having said that, even if the authors had replaced two of the quantum sources with classical ones, I'm not sure that experiment would have been interesting, since such an experiment would be essentially a Bell experiment.
I have three other main comments: As far as I understand, an important parameter in the neural networks considered in the paper is the size of the inputs, that is, the alphabet sizes of the hidden variables Lambda_{AB}, Lambda_{BC} and Lambda_{AC}. Maybe I'm missing something, but I don't see any comment on this in the paper.
In the inflation part, the dual vector obtained via Farkas' lemma is quite interesting since it is very symmetric. I wish the authors had commented more on the features of this vector and the resulting inequality. In particular, I would have liked to see an analytic proof of that inequality. My point is that the nonlocality of Fritz's distribution is rooted in the CHSH inequality. Moreover, as far as I understand, the CHSH inequality itself can be proven using a second-order inflation. Thus it is not a surprise that the locality of a noisy version of Fritz's distribution can be refuted using a second-order inflation. But since the inequality obtained via this method is quite symmetric (and the CHSH inequality has an analytic proof), it would be interesting to find a direct analytic proof of the new inequality too. Such a proof may advance our understanding of network nonlocality.
It would also be interesting if the authors had compared the results of the two methods. The machine learning technique gives a (heuristic) bound on the visibility of the distribution under study. The inequality found via the inflation technique also gives such a bound. Now my question is: how do these two bounds compare to each other?
And here are some minor comments:
Line 66: what does "new set of data analysis techniques" refer to? If they are the machine learning and inflation techniques, both of them appear in previous works.
Line 99: "is demonstrate" -> to demonstrate
Line 117: "answers the question" -> and answers
Line 185: "Positive-valued-measure" -> positive operator-valued measure
Line 206: shouldn't a_0 in "p(a_1, b_1| a_0, c_0)" be c_1?
First paragraph of Supplementary Note 4: "an ansatz causal structures"
Supplementary Note 6 is very brief and is not clear to me. Expanding this part and e.g. giving the proof of equation (
The plot clearly shows that a significant violation can be obtained for all the considered values of the window, ranging from 2.835 μs to 63.585 μs with a corresponding detected 6-fold coincidence rate of 3.59 Hz and 31.62 Hz, respectively.
Reviewer #1 (Remarks to the Author): I read through the replies to my previous comments and the changes made in the manuscript. I would like to first thank the authors for the detailed replies to my comments. I am satisfied with most of them. However, I have a different opinion on the potential problems caused by post-selecting the events where the external input for measuring a photon in a station coincides with the outcome of measuring another photon at the same station. The authors think that this post-selection is the same as the post-selection due to the low detection efficiency in a Bell test. On the other hand, I think that the use of external inputs and the corresponding post-selection introduce more fundamental problems than the detection loophole. First, the triangle causal structure studied in this work does not require any external input, while in the experiment reported in this work there are external inputs (although the authors didn't call them external inputs explicitly). How can one justify the experiment as a faithful realization of the triangle causal structure studied?
Here I apologize for the unclear statement "this is not a faithful realization of the Fritz distribution" in my previous report. I meant "this is not a faithful realization of the triangle causal structure". Second, the post-selection due to the low detection efficiency can be justified by the fair sampling condition (that is, the sub-ensemble of detected particles is a fair representation of the whole ensemble). Can one justify the post-selection based on the coincidence between the external input and measurement outcome in the same way? It seems to me that the post-selection based on coincidence introduces correlation or artificial causal connection between two events, while the fair sampling conveys the independence of particle behavior from detection efficiency.
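The general mechanism behind the concern above, namely that conditioning on a coincidence can create correlations between otherwise independent events, can be shown with a toy enumeration. This is a sketch under stated assumptions and is not a model of the experiment: the bits `s` (a setting), `o` (a local outcome) and `b` (a remote outcome perfectly correlated with `o`) are hypothetical, with `s` independent of both `o` and `b` before any selection.

```python
from fractions import Fraction
from itertools import product

# Enumerate the toy model exactly: s and o are independent fair bits,
# and b is a copy of o (a stand-in for a perfectly correlated partner).
events = []
for s, o in product((0, 1), repeat=2):
    b = o
    events.append((s, o, b, Fraction(1, 4)))  # (setting, outcome, remote, weight)

def prob_setting_equals_remote(evts):
    """Probability that s == b within the given (possibly post-selected) events."""
    total = sum(w for *_, w in evts)
    agree = sum(w for s, _, b, w in evts if s == b)
    return agree / total

# Without post-selection the setting carries no information about b.
before = prob_setting_equals_remote(events)

# Post-select on the coincidence s == o, then look at s versus b again.
kept = [e for e in events if e[0] == e[1]]
after = prob_setting_equals_remote(kept)

print("P(s == b) before post-selection:", before)
print("P(s == b) after post-selection: ", after)
```

In this toy model the agreement probability rises from 1/2 to 1 purely because of the selection rule. Whether the experimental post-selection can be justified in the same way as fair sampling is exactly the question put to the authors.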
Reviewer #2 (Remarks to the Author): I have read the revised manuscript, and I am happy with all but one of the authors' responses to my original comments. I still have some misgivings about the discussion of the source independence.
I appreciate the authors' response and essentially agree with what they wrote in the letter. However, I find that their modification of the manuscript does not entirely reflect this. In particular, the following sentence in the paper stands out: "To justify the assumption of source independence, it is essential to use non-synchronized lasers to pump the generation crystals that act as sources." Of course, using different lasers makes this easier to justify on an intuitive level, but they have not convinced me that it is "essential". Are the lasers powered by the same circuit? Could correlations arise from electrical power supplies? Of course, given our current understanding of the experiment, this is most likely not the case. But it is also most likely not the case, given our current understanding of such experiments, that the same laser pumping three different non-linear crystals will generate correlations between the sources. Perhaps an argument could be made based on the spatial separation of the lasers, but that is not provided.
To be honest, I do not know exactly what they should say about this point. I understand their motivation to use three separate lasers, but it seems to me that this is simply based on a feeling and not on the underlying physics. I worry that as it is written (worded very strongly, and without appropriate justification) the reader may take the authors' approach as a solution to this problem, when it may not be.

Reviewer #3 (Remarks to the Author):
Regarding my main comment on replacing two of the singlets with classical sources, the authors mention that: 1) This is "a feat rather than a failure of our implementation" and "the fact we have noisy data implies we cannot rely on a standard Bell test"; 2) This setup "can in principle be useful also for realizing other quantum distributions" and "in this way, we demonstrate general analysis tools for a real and versatile quantum network, where each involved source is able to generate and distribute quantum states". In my opinion, neither of these points convincingly justifies making the experimental setup more complicated. I agree that with this setup one needs new tools, beyond the CHSH inequality, to rule out classicality. Yet, as far as I understand, the main point of the paper is an experimental demonstration of nonlocality via Fritz's distribution. If it is about the tools to prove nonlocality, as already raised in the reviews, they are more or less known, and the argument that no one has applied them to real experimental data is not convincing to me. The authors also argue that this setup (with three shared entangled states) is useful to demonstrate nonlocality of other distributions in the triangle network. This is actually a good point, but why didn't the authors use this setup for other such distributions in the literature that do require three entangled states as shared sources?
Overall although I believe this paper has several interesting points and has developed techniques that would be useful in the study and experimental realization of network nonlocality, I cannot spot a strong selling point that makes it appropriate for Nature Communications.

REVIEWER COMMENTS
Reviewer #1 (Remarks to the Author): I thank the authors for considering my previous comment seriously. I read through the revised manuscript. As far as I can see, the authors have made efforts to address my concern; however, one point in the revised experiment is still unclear to me. In the experimental setup depicted in Fig. 2, the classical correlations between Alice and Charlie or between Bob and Charlie are simulated by distributing two random numbers generated by the two QRNGs in the figure.
Specifically, the generation of random numbers is entirely completed in the source station labeled Lambda_AC or Lambda_BC in the figure. After the random number generation, the same classical signal, i.e., the random number generated, is distributed to Alice and Charlie or Bob and Charlie. I don't think this is a faithful realization of the classically correlated state described in Eq. (6) of the main text. To my understanding, the use of QRNGs here is essentially the same as that in a standard Bell test: Alice and Bob each receive a free input choice. Hence, the revised experiment does not entirely address my concern. The analysis of the experiment reported here can be performed by showing a violation of the standard Bell inequality, in contrast to the authors' claim. For this work to be considered for publication in Nature Communications, the authors must implement the following in their experiment: the classically correlated states described in Eq. (6) should be truly realized in the source stations Lambda_AC or Lambda_BC, and these classically correlated states should be distributed and detected at the measurement stations (Alice, Bob, and Charlie) instead of the source stations.
Besides the above, I have a couple of comments on the data analysis techniques used in this work. First, for the standard Bell test, there are techniques for deriving Bell inequalities tailored to the specific data observed in an experiment; see arXiv:0905.2950 and Phys. Rev. A 84, 062118 (2011), for example. Several statements in the manuscript, for instance the third paragraph of the Discussion, imply the absence of such techniques. Second, the newly added data analysis based on the violation of entropic inequalities needs more detail; the current description of the method in Sect. IV C is too brief. For example, it is hard to understand the last sentence in the caption of Fig. 7. Also, there is no explanation of the meaning of the entropic inequality in Eq. (8). In addition, I would like to point out that Bell violation is robust against measurement dependence; see Phys. Rev. Lett. 113, 190402 (2014) for an illustration. This fact contradicts the last sentence in the first paragraph of Sect. IV C.
Reviewer #2 (Remarks to the Author): I find the new version of the manuscript, including the new experimental results, quite clear and pedagogical. The authors have completely addressed all of my previous comments.
In the new manuscript, I also find the narrative of the removal of the freedom-of-choice assumption from a standard Bell test quite compelling. Hence, I recommend publication.
I have one related question. In this experiment the authors have not closed the locality and detection efficiency loopholes, and they have moved the freedom-of-choice loophole to a source independence loophole. They discuss the source independence loophole already, and the locality loophole is obvious, but for the detection efficiency, how does the threshold in this setup compare to standard Bell tests?